Xiaoyang Qu

HMPO: Hybrid Median-length Policy Optimization for Chain-of-Thought Compression

Jun 01, 2026
Via arXiv

DIVA: Harnessing the Representation Divergence in Unified Multimodal Models for Mutual Reinforcement

May 25, 2026
Via arXiv

WindowQuant: Mixed-Precision KV Cache Quantization based on Window-Level Similarity for VLMs Inference Optimization

May 04, 2026
Via arXiv

VLA-InfoEntropy: A Training-Free Vision-Attention Information Entropy Approach for Vision-Language-Action Models Inference Acceleration and Success

Apr 07, 2026
Via arXiv

Vista: Scene-Aware Optimization for Streaming Video Question Answering under Post-Hoc Queries

Feb 09, 2026
Via arXiv

From Knowing to Doing Precisely: A General Self-Correction and Termination Framework for VLA models

Feb 02, 2026
Via arXiv

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Feb 02, 2026
Via arXiv

Triage: Hierarchical Visual Budgeting for Efficient Video Reasoning in Vision-Language Models

Jan 30, 2026
Via arXiv

MIRRORTALK: Forging Personalized Avatars Via Disentangled Style and Hierarchical Motion Control

Jan 30, 2026
Via arXiv

MiTa: A Hierarchical Multi-Agent Collaboration Framework with Memory-integrated and Task Allocation

Jan 30, 2026
Via arXiv